Systematic Abductive Reasoning via Diverse Relation Representations in Vector-symbolic Architecture

Sun, Zhong-Hua, Zhang, Ru-Yuan, Zhen, Zonglei, Wang, Da-Hui, Li, Yong-Jie, Wan, Xiaohong, You, Hongzhi

arXiv.org Artificial Intelligence

In abstract visual reasoning, monolithic deep learning models suffer from limited interpretability and generalization, while existing neuro-symbolic approaches fall short in capturing the diversity and systematicity of attribute and relation representations. To address these challenges, we propose a Systematic Abductive Reasoning model with diverse relation representations (Rel-SAR) in Vector-symbolic Architecture (VSA) to solve Raven's Progressive Matrices (RPM). To derive attribute representations with symbolic reasoning potential, we introduce not only various types of atomic vectors that represent numeric, periodic and logical semantics, but also the structured high-dimensional representation (SHDR) for the overall Grid component. For systematic reasoning, we propose novel numerical and logical relation functions and perform rule abduction and execution in a unified framework that integrates these relation representations. Experimental results demonstrate that Rel-SAR achieves significant improvement on RPM tasks and exhibits robust out-of-distribution generalization. Rel-SAR leverages the synergy between HD attribute representations and symbolic reasoning to achieve systematic abductive reasoning with both interpretable and computable semantics.
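As a concrete illustration of the VSA machinery the abstract describes, the sketch below encodes a numeric attribute with fractional power encoding and tests a progression relation via binding (circular convolution). It is a minimal sketch of generic holographic-reduced-representation operations, assuming unitary base vectors; the names and the relation check are illustrative, not Rel-SAR's actual vocabulary or code.

```python
import numpy as np

D = 1024  # hypervector dimensionality (illustrative)
rng = np.random.default_rng(0)

def unitary_hv():
    """Random unitary vector: unit-magnitude spectrum, so powers stay unit-norm."""
    phases = rng.uniform(-np.pi, np.pi, D // 2 + 1)
    spectrum = np.exp(1j * phases)
    spectrum[0] = spectrum[-1] = 1.0  # keep DC/Nyquist terms real
    return np.fft.irfft(spectrum, n=D)

def bind(a, b):
    """Binding via circular convolution, computed in the Fourier domain."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=D)

def fpe(base, value):
    """Fractional power encoding: a numeric value as base**value."""
    return np.fft.irfft(np.fft.rfft(base) ** value, n=D)

size_base = unitary_hv()

# A numeric progression rule ("size increases by 1") becomes algebra on vectors:
# binding multiplies spectra, so exponents -- i.e., attribute values -- add.
predicted = bind(fpe(size_base, 3), fpe(size_base, 1))   # size 3, advanced by +1
match = float(np.dot(predicted, fpe(size_base, 4)))      # ~1.0: rule satisfied
mismatch = float(np.dot(predicted, fpe(size_base, 5)))   # ~0.0: rule violated
print(f"match={match:.3f}, mismatch={mismatch:.3f}")
```

The same algebra supports the abductive direction: given two panels, the rule's increment can be recovered by unbinding, which is what makes the representation both interpretable and computable.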


Learning to reason over visual objects

Mondal, Shanka Subhra, Webb, Taylor, Cohen, Jonathan D.

arXiv.org Artificial Intelligence

Despite the centrality of objects in visual reasoning, previous works have so far not explored the use of object-centric representations in abstract visual reasoning tasks such as RAVEN and PGM, or at best have employed an imprecise approximation to object representations based on spatial location. Recently, a number of methods have been proposed for the extraction of precise object-centric representations directly from pixel-level inputs, without the need for veridical segmentation data (Greff et al., 2019; Burgess et al., 2019; Locatello et al., 2020; Engelcke et al., 2021). While these methods have been shown to improve performance in some visual reasoning tasks, including question answering from video (Ding et al., 2021) and prediction of physical interactions from video (Wu et al., 2022), previous work has not addressed whether this approach is useful in the domain of abstract visual reasoning (i.e., visual analogy). To address this, we developed a model that combines an object-centric encoding method, slot attention (Locatello et al., 2020), with a generic transformer-based reasoning module (Vaswani et al., 2017). The combined system, termed the Slot Transformer Scoring Network (STSN; Figure 1), achieves state-of-the-art performance on both PGM and I-RAVEN (a more challenging variant of RAVEN), despite its general-purpose architecture and lack of task-specific augmentations. Furthermore, we developed a novel benchmark, the CLEVR-Matrices (Figure 2), using a similar RPM-like problem structure but with greater visual complexity, and found that STSN also achieves state-of-the-art performance on this task. These results suggest that object-centric encoding is an essential component for achieving strong abstract visual reasoning, and indeed may be even more important than some task-specific inductive biases.
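Since the abstract names its two building blocks explicitly, here is a minimal PyTorch sketch of the slot attention step (Locatello et al., 2020) that produces the object-centric slots STSN feeds to its transformer scorer. Dimensions are illustrative and the residual MLP of the original module is omitted for brevity; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    """Minimal slot attention (Locatello et al., 2020); hyperparameters illustrative."""
    def __init__(self, num_slots=7, dim=64, iters=3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_sigma = nn.Parameter(torch.rand(1, 1, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs):            # inputs: (B, N, dim) flattened CNN features
        B, dim = inputs.size(0), inputs.size(-1)
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_sigma * torch.randn(
            B, self.num_slots, dim, device=inputs.device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Softmax over slots: slots compete to explain each input location.
            attn = (torch.einsum('bnd,bsd->bns', k, q) * self.scale).softmax(dim=-1)
            attn = attn / attn.sum(dim=1, keepdim=True)   # weighted mean over inputs
            updates = torch.einsum('bns,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, dim),
                             slots.reshape(-1, dim)).view(B, self.num_slots, dim)
        return slots  # object-centric slots, scored by a downstream transformer
```

In the STSN pipeline as described, slots extracted from the context panels and each candidate answer are passed to the transformer, which outputs a score per candidate.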


Drosophila Gene Expression Pattern Annotations via Multi-Instance Biological Relevance Learning

Wang, Hua (Colorado School of Mines) | Deng, Cheng (Xidian University) | Zhang, Hao (Colorado School of Mines) | Gao, Xinbo (Xidian University) | Huang, Heng (University of Texas at Arlington)

AAAI Conferences

Recent developments in biology have produced a large number of gene expression patterns, many of which have been annotated textually with anatomical and developmental terms. These terms spatially correspond to local regions of the images, which are attached collectively to groups of images. Because one does not know which term is assigned to which region of which image in the group, developmental stage classification and anatomical term annotation turn out to be a multi-instance learning (MIL) problem, in which inputs are bags of instances and labels are assigned to the bags. Most existing MIL methods routinely use Bag-to-Bag (B2B) distances, which, however, are often computationally expensive and may not truly reflect the similarities between the anatomical and developmental terms. In this paper, we approach the MIL problem from a new perspective using Class-to-Bag (C2B) distances, which directly assess the relations between annotation terms and image panels. Taking into account the two challenging properties of multi-instance gene expression data, high heterogeneity and weak label association, we compute the C2B distance by introducing class-specific distance metrics and locally adaptive significance coefficients. We apply our new approach to automatic gene expression pattern classification and annotation on the Drosophila melanogaster species. Extensive experiments have demonstrated the effectiveness of our new method.
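The abstract's key move, replacing Bag-to-Bag with Class-to-Bag distances, can be made concrete with a small sketch. Everything below is illustrative: the paper learns the class-specific metrics and significance coefficients from data, whereas this toy function simply takes them as inputs.

```python
import numpy as np

def c2b_distance(class_instances, bag, metric=None, significance=None):
    """Illustrative Class-to-Bag (C2B) distance: each instance representing the
    class is matched to its nearest instance in the bag, and the matches are
    combined with per-instance significance weights. The paper's formulation
    (learned class-specific metrics, locally adaptive coefficients) is more
    elaborate; this only conveys the C2B idea.

    class_instances: (m, d) instances summarizing one annotation term
    bag:             (k, d) instance features from one group of image panels
    metric:          (d, d) PSD matrix, a class-specific Mahalanobis metric
    significance:    (m,) locally adaptive significance coefficients
    """
    m, d = class_instances.shape
    M = np.eye(d) if metric is None else metric
    w = np.ones(m) / m if significance is None else significance
    dist = 0.0
    for weight, x in zip(w, class_instances):
        diffs = bag - x                                    # (k, d)
        mahal = np.einsum('kd,de,ke->k', diffs, M, diffs)  # squared distances
        dist += weight * mahal.min()                       # nearest bag instance
    return dist

# A bag is then annotated with the terms whose C2B distance is smallest.
```

Note the asymmetry relative to B2B schemes: only class-side instances are matched against the bag, so the cost scales with the class summary rather than with all pairs of bags.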


OCR-Based Image Features for Biomedical Image and Article Classification: Identifying Documents Relevant to Genomic Cis-Regulatory Elements

Shatkay, Hagit (University of Delaware) | Narayanaswamy, Ramya (University of Delaware) | Nagaral, Santosh S. (University of Delaware) | Harrington, Na (Queen's University) | MV, Rohith (University of Delaware) | Somanath, Gowri (University of Delaware) | Tarpine, Ryan (Brown University) | Schutter, Kyle (Brown University) | Johnstone, Tim (Brown University) | Blostein, Dorothea (Queen's University) | Istrail, Sorin (Brown University) | Kambhamettu, Chandra (University of Delaware)

AAAI Conferences

Images form a significant, yet under-utilized, information source in published biomedical articles. Much current work on biomedical image retrieval and classification uses simple, standard image representations employing features such as edge direction or gray-scale histograms. In our earlier work we have used such features as well to classify images, where image-class tags have been used to represent and classify complete articles. Here we focus on a different literature classification task: identifying articles discussing cis-regulatory elements and modules, motivated by the need to understand complex gene networks. Curators attempting to identify such articles use as a major cue a certain type of image in which the conserved cis-regulatory region on the DNA is shown. Our experiments show that automatically identifying such images using common image features (such as gray scale) is highly error-prone. However, using Optical Character Recognition (OCR) to extract alphabet characters from images, calculating the character distribution, and using the distribution parameters as image features forms a novel image representation, which allows us to identify DNA content in images with high precision and recall (over 0.9). Utilizing the occurrence of DNA-rich images within articles, we train a classifier to identify articles pertaining to cis-regulatory elements with similarly high precision and recall. Using OCR-based image features has much potential beyond the current task, to identify other types of biomedical sequence-based images showing DNA, RNA and proteins. Moreover, automatically identifying such images is applicable beyond the current use-case, in other important biomedical document classification tasks.
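The OCR-based representation lends itself to a short sketch. This is a hedged reconstruction of the idea, assuming pytesseract as the OCR engine (the paper does not prescribe one) and illustrative feature names; the paper's actual distribution parameters may differ.

```python
from collections import Counter

import pytesseract          # one possible OCR engine; the paper's tool may differ
from PIL import Image

DNA_LETTERS = set("ACGT")

def char_distribution_features(image_path):
    """OCR the image, then summarize the distribution of recognized characters.
    DNA alignment figures are dominated by A/C/G/T, so their character
    distribution separates them from plots, gels, and micrographs."""
    text = pytesseract.image_to_string(Image.open(image_path))
    letters = [c.upper() for c in text if c.isalpha()]
    counts = Counter(letters)
    total = max(len(letters), 1)
    return {
        "n_chars": len(letters),
        "dna_fraction": sum(counts[c] for c in DNA_LETTERS) / total,
        "letter_diversity": len(counts) / 26,
    }

# Images with many characters and a high dna_fraction are flagged as DNA-content
# images; the presence of such images in an article then feeds a second,
# article-level classifier for cis-regulatory relevance.
```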


Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning

Li, Ying-Xin (Nanjing University) | Ji, Shuiwang (Arizona State University) | Kumar, Sudhir (Arizona State University) | Ye, Jieping (Arizona State University) | Zhou, Zhi-Hua (Nanjing University)

AAAI Conferences

The Berkeley Drosophila Genome Project (BDGP) has produced a large number of gene expression patterns, many of which have been annotated textually with anatomical and developmental terms. These terms spatially correspond to local regions of the images; however, they are attached collectively to groups of images, such that it is unknown which term is assigned to which region of which image in the group. This poses a challenge to the development of computational methods to automate the textual description of expression patterns contained in each image. In this paper, we show that the underlying nature of this task matches well with a new machine learning framework, Multi-Instance Multi-Label learning (MIML). We propose a new MIML support vector machine to solve the problems that beset the annotation task.

(Figure 1 of the original shows samples of images and associated annotation terms of the gene Actn in the stage ranges 11-12 and 13-16 in the BDGP database; the darkly stained regions highlight where the gene is expressed.)
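To make the MIML setting concrete, the sketch below builds synthetic bags of instance vectors carrying sets of term labels, then reduces the problem to one SVM per term over a simple bag embedding. This is a baseline reduction for illustration only, not the MIML support vector machine the paper proposes; all data and term names are synthetic.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# MIML data: a bag is a group of image feature vectors (one group of panels),
# and each bag carries a *set* of annotation terms. All values are synthetic.
bags = [rng.normal(size=(int(rng.integers(2, 6)), 16)) for _ in range(40)]
all_terms = ["embryonic hindgut", "ventral nerve cord", "brain primordium"]
labels = [set(rng.choice(all_terms, size=int(rng.integers(1, 3)), replace=False))
          for _ in bags]

def embed(bag):
    """Collapse a bag into one vector (mean + max pooling) so a standard SVM
    applies. A degenerate baseline, not the paper's MIML SVM."""
    return np.concatenate([bag.mean(axis=0), bag.max(axis=0)])

X = np.stack([embed(b) for b in bags])
classifiers = {}
for term in all_terms:  # one binary SVM per annotation term (one-vs-rest)
    y = np.array([term in s for s in labels], dtype=int)
    classifiers[term] = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Predicted term set for the first bag:
predicted = {t for t, clf in classifiers.items() if clf.predict(X[:1])[0] == 1}
print(predicted)
```

The point of MIML is precisely that this reduction loses information: pooling discards which instance carries which term, and per-term classifiers ignore label co-occurrence, both of which a native MIML learner can exploit.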